Abstract

We study the problem of scheduling m tasks to n selfish, unrelated machines in order 
to minimize the makespan, where the execution times are independent random variables, 
identical across machines. We show that the VCG mechanism, which myopically allocates
each task to its best machine, achieves an approximation ratio of $O\left(\frac{\ln n}{\ln\ln n}\right)$. This improves
significantly on the previously best known bound of $O\left(\frac{m}{n}\right)$ for prior-independent mechanisms,
given by Chawla et al. [STOC'13] under the additional assumption of Monotone
Hazard Rate (MHR) distributions. Although we demonstrate that this is in general tight,
if we do maintain the MHR assumption, then we get improved, (small) constant bounds for
$m \ge n \ln n$ i.i.d. tasks, while we also identify a sufficient condition on the distribution that
yields a constant approximation ratio regardless of the number of tasks.


1 Introduction 

We consider the problem of scheduling tasks to machines, where the processing times of the tasks 
are stochastic and the machines are strategic. The goal is to minimize the expected completion 
time (a.k.a. makespan) of any machine, where the expectation is taken over the randomness 
of the processing times and the possible randomness of the mechanism. We are interested in 
the performance, i.e. the expected makespan, of truthful mechanisms compared to the optimal 
algorithm that does not take the incentives of the machines into consideration. This problem, 
which we call the Bayesian scheduling problem, was previously considered by Chawla et al. 
[8]. Scheduling problems constitute a very rich and intriguing area of research [21]. In one
of the most fundamental cases, the goal is to schedule m tasks to n parallel machines while 
minimizing the makespan, when the processing times of the tasks are selected by an adversary 
in an arbitrary way and can depend on the machine to which they are allocated. However, 
the assumption that the machines will blindly follow the instructions of a central authority 
(scheduler) was eventually challenged, especially due to the rapid growth of the Internet and 
its use as a primary computing platform. Motivated by this, in their seminal paper Nisan and 
Ronen [29] introduced a mechanism-design approach to the scheduling problem: the processing 
times of the tasks are now private information of the machines, and each machine declares to 
the mechanism how much time it requires to execute each task. The mechanism then outputs 
the allocation of tasks to machines, as well as monetary compensations to the machines for their 
work, based solely on these declarations. In fact, the mechanism has to decide the output in 
advance, for any possible matrix of processing times the machines can report. Each machine is 
assumed to be rational and strategic, so, given the mechanism and the true processing times, 
its declarations are chosen in order to minimize the processing time/cost it has to spend for the
execution of the allocated tasks minus the payment it will receive. In this scenario, the goal is 
to design a truthful mechanism that minimizes the makespan; truthful mechanisms define the 
allocation and payment functions so that the machines don’t have an incentive to misreport 
their true processing-time capabilities. We will refer to this model as the prior free scheduling 
problem, as opposed to the stochastic model we discuss next. 

In the Bayesian scheduling problem [8], the time a specific machine requires in order to
process a task is drawn from a distribution. We consider one of the fundamental questions 
posed by the algorithmic mechanism design literature, which is about quantifying the potential 
performance loss of a mechanism due to the requirement for truthfulness. In the Bayesian 
scheduling setting, this question translates to: What is the maximum ratio (for any distribution
of processing times) of the expected makespan of the best truthful mechanism over the expected
optimal makespan (ignoring the machines' incentives)?

In this paper we tackle this question by considering a well known and natural truthful 
mechanism, the Vickrey-Clarke-Groves mechanism (VCG) [34, 11, 20]. VCG can be defined for
very general mechanism design settings. In the special case of scheduling unrelated machines, it 
has a very simple interpretation: greedily and myopically allocate each task to a machine that 
minimizes its processing time. It is a well known fact that VCG is a truthful mechanism in a very 
strong sense; truth-telling is a dominant strategy for the machines. Because of the notorious 
lack of characterization results for truthfulness for restricted domains such as scheduling, VCG 
(or more generally, affine maximizers) is the standard and obvious choice to consider for the 
Bayesian scheduling problem. We stress here that for the scheduling domain (and for any 
additive domain) the VCG allocation and payments can be computed in polynomial time. Also, 
it is important to note that VCG is a prior-independent mechanism, i.e. it does not require any 
knowledge of the prior distribution from which the processing times are drawn. 

Prior-independence is a very strong property, and is an important feature for mechanisms 
used in stochastic settings. Being robust with respect to prior distributions facilitates applicability
in real systems, while at the same time bypassing the pessimistic inapproximability results
of worst case analysis. The idea is that we would like the mechanisms we use, without relying 
on any knowledge of the distribution of the processing times of the tasks, to still perform well 
compared to the optimal mechanism that is tailored for the particular distribution. 

Chawla et al. [8] were the first to examine the Bayesian scheduling problem while considering
the importance of prior-independence. They study the following two mechanisms:

Bounded overload with parameter c. Allocate tasks to machines such that the sum of the
processing times of all tasks is minimized, subject to placing at most $c\frac{m}{n}$ tasks at any
machine.

Sieve and bounded overload with parameters c, β, and δ. Fix a partition of the machines
into two sets of sizes $(1-\delta)n$ and $\delta n$. Ignoring all processing times which exceed β¹ (i.e.
setting them equal to infinity), run VCG on the first set of machines. For the tasks that
remain unallocated run the bounded overload mechanism with parameter c on the second
set of machines.

The above mechanisms are inspired by maximal-in-range (affine maximizer) mechanisms [30]
and threshold mechanisms, as these are essentially the only non-trivial truthful mechanisms we
know for the scheduling domain. One would expect that the simplest of these mechanisms,
which is the VCG mechanism, would be the first to be considered. Indeed, VCG is the most
natural, truthful, simple, polynomial time computable, and prior-independent mechanism. Still,

¹Assume you run VCG on the first set of machines plus a dummy machine with processing time β on all tasks.
The case where a task has processing time equal to β can be ignored without loss of generality for the case of
continuous distributions.





the authors in [8] design the above mechanisms in an attempt to prevent certain bad behaviour
that VCG exhibits on a specific input instance and do not examine VCG beyond that point. As
we demonstrate in this paper, however, this specific instance actually constitutes the worst case
scenario for VCG and we can identify cases where VCG performs considerably better, either
by placing a restriction on the number of tasks or by making some additional distributional
assumptions.

Our results. We prove an asymptotically tight bound of $\Theta\left(\frac{\ln n}{\ln\ln n}\right)$ for the approximation
ratio of VCG for the Bayesian scheduling problem under the sole assumption that the machines
are a priori identical. This bound is achieved by showing that the worst case input for VCG is
actually one where the tasks are all of unit weight (point mass distributions). This resembles a
balls-in-bins type scenario from which the bound is implied.

Whenever the processing times of the tasks are i.i.d. and drawn from an MHR continuous
distribution, VCG is shown to be $2\left(1+\frac{n\ln n}{m}\right)$-approximate for the Bayesian scheduling
problem. This immediately implies a constant bound at most equal to 4 when $m \ge n\ln n$.
We also get an improved bound of $1+\sqrt{2}$ when $m \ge n^2$ using a different approach. For the
complementary case of $m \le n\ln n$, we identify a property of the distribution of processing
times such that VCG again achieves a constant approximation. We observe that important
representatives of the class of MHR distributions, that is the uniform distribution on [0,1] as
well as exponential distributions, do satisfy this property, so for these distributions VCG is
4-approximate regardless of the number of tasks. We note however that this is not the case for
all MHR distributions.

The continuity assumption plays a fundamental role in the above results. In particular,
we give a lower bound of $\Omega\left(\frac{\ln n}{\ln\ln n}\right)$ for i.i.d. processing times that uses a discrete
MHR distribution. Finally, we also consider the bounded overload and the sieve and bounded
overload mechanisms that were studied by Chawla et al. [8], and present some instances that
lower-bound their performance.

Related Work. One of the fundamental papers on the approximability of scheduling with
unrelated machines is by Lenstra et al. [25] who provide a polynomial time algorithm that
approximates the optimal makespan within a factor of 2. They also prove that it is NP-hard
to approximate the optimal makespan within a factor of 3/2 in this setting. In the
mechanism design setting, Nisan and Ronen [29] prove that the well known VCG mechanism
achieves an n-approximation of the optimal makespan, while no truthful mechanism can achieve
approximation ratio better than 2. Note that the upper bound immediately carries over to the
Bayesian and the prior-independent scheduling case. The lower bound has been improved by
Christodoulou et al. m and Koutsoupias and Vidali [23] to 2.61, while Ashlagi et al. [2] prove
the tightness of the upper bound for deterministic anonymous mechanisms. In contrast to
the negative result on the prior free setting presented in [2], truthful mechanisms can achieve
sublinear approximation when the processing times are stochastic. In fact, we prove here that
VCG can achieve a sublogarithmic approximation, and even a constant one for some cases,
while similar bounds for other mechanisms have also been presented by Chawla et al. [8].

For the special case of related machines, where the private information of each machine
is a single value, Archer and Tardos [1] were the first to give a 3-approximation truthful in
expectation mechanism, while by now truthful PTAS are also known [91II1II7|. Putting computational
considerations aside, the best truthful mechanism in this single-dimensional setting
is also optimal. Lavi and Swamy [23] managed to prove constant approximation ratio for a
special, yet multi-dimensional scheduling problem; they consider the case where the processing
times of each task can take one of two fixed values. Yu [35] then generalized this result to
two-range-values, while together with Lu and Yu m and Lu m, they gave constant (better
than 1.6) bounds for the case of two machines.
Daskalakis and Weinberg m consider computationally tractable approximations with respect
to the best (Bayesian) truthful mechanism when the processing times of the tasks follow
distributions (with finite support) that are known to the mechanism designer. In fact the authors
provide a reduction of this problem to an algorithmic problem. Chawla et al. [7] showed
that there can be no approximation-preserving reductions from mechanism design to algorithm
design for the makespan objective; however, the authors in m bypass this inapproximability
by considering the design of bi-criterion approximation algorithms.

Prior-independent mechanisms have been mostly considered in the context of optimal auction
design, where the goal is to design an auction mechanism that maximizes the seller's
revenue. Inspired by the work of Dhangwatnotai et al., Devanur et al. m and Roughgarden
et al. [32] independently provide approximation mechanisms for multi-dimensional settings,
with recent follow-up work by Goldner and Karlin m and Azar et al. [4]. Moreover, Dughmi
et al. m identify conditions under which VCG obtains a constant fraction of the optimal revenue,
while Hartline and Roughgarden [22] prove Bulow-Klemperer type results for VCG. Prior
robust optimization is also discussed by Sivan [33].

Chawla et al. [8] are the first to consider prior-independent mechanisms for the (Bayesian)
scheduling problem. They introduce two variants of the VCG mechanism and bound their approximation
ratios. In particular, the bounded overload mechanism is prior-independent and
achieves an $O\left(\frac{m}{n}\right)$ approximation of the expected optimal makespan when the processing times
of the tasks are drawn from machine-identical MHR distributions. For the case where the processing
times of the tasks are i.i.d. from an MHR distribution, the authors prove that sieve and
bounded overload mechanisms can achieve an $O(\sqrt{\ln n})$ approximation of the expected optimal
makespan, as well as an approximation ratio of $O((\ln\ln n)^2)$ under the additional assumption
that there are at least $n\ln n$ tasks. We note that to achieve these improved approximation
ratios, a sieve and bounded overload mechanism needs to have access to a small piece of information
regarding the distribution of the processing times, in particular the expectation of the
minimum of a certain number of draws (in contrast to VCG which requires no distributional
information whatsoever).

The VCG mechanism is strongly represented in the above works. Its simplicity and amenability
to practice strongly motivate a detailed analysis of its performance for the Bayesian scheduling
problem. From our results, it turns out that in general VCG performs better than the
previously analyzed prior-independent mechanisms, applies to wider settings with fewer restrictions
on the distributions and, of course, it is simpler. To summarize and clarify this comparison
with the previous prior-independent mechanisms of Chawla et al. [8], we note that the only case where
VCG demonstrates a worse approximation ratio is when the number of tasks is asymptotically
very close to that of machines, in particular $m = o\left(\frac{n\ln n}{\ln\ln n}\right)$ and, in addition, we are in a
restricted setting where the execution times have to be drawn from necessarily non-identical,
MHR distributions. For example, for m = n tasks with processing times drawn from machine-identical
MHR distributions which however differ across tasks, the bounded overload mechanism
of Chawla et al. [8] would be constant O(1)-approximate, while VCG would have an approximation
ratio of $\Theta\left(\frac{\ln n}{\ln\ln n}\right)$. However, a point worth mentioning here is that the constant hidden
within the O(1)-notation above is 800 while the one in the upper-bound $O\left(\frac{\ln n}{\ln\ln n}\right)$ of VCG
comes directly from a balls-in-bins analysis and therefore is 1 + o(1).

2 Preliminaries and Notation 

Assume that we have n unrelated parallel machines and $m \ge n$ tasks that need to be scheduled
to these machines. Let $t_{ij}$ denote the processing time of task j on machine i. In the Bayesian
scheduling problem, each $t_{ij}$ is independently drawn from some probability distribution $\mathcal{D}_{ij}$. In
this paper we mainly consider the machine-identical setting, that is the processing times of a
specific task j are drawn from the same distribution $\mathcal{D}_j$ for all the machines. This is a standard
assumption for the problem (see also [8]). We also consider the case where both machines and
tasks are considered a priori identical, and the processing times $t_{ij}$ are all i.i.d. drawn from the
same distribution $\mathcal{D}$. The goal is to design a truthful mechanism that minimizes the expected
makespan of the schedule.

We consider the VCG mechanism, the most natural and standard choice for a truthful
mechanism. Thus, we henceforth assume that the machines always declare their true processing
times. VCG minimizes the total workload by allocating each task to the machine that minimizes
its processing time. So, if a denotes the allocation function of VCG (we omit the dependence
on t for clarity of presentation) then, for any task j, $a_{ij} = 1$ for some machine i such that
$t_{ij} = \min_{i'} t_{i'j}$, otherwise $a_{ij} = 0$. Without loss of generality we assume that in case of a tie,
the machine is chosen uniformly at random.² The expected makespan of VCG is then computed
as

$$E[\mathrm{VCG}(t)] = E\left[\max_{i=1,\dots,n} \sum_{j=1}^{m} a_{ij}\, t_{ij}\right].$$

In what follows, we use variable $Y_{ij}$ to denote the processing time of task j on machine i under
VCG, that is $Y_{ij} = a_{ij} t_{ij}$. We also denote by $Y_i = \sum_{j=1}^{m} Y_{ij}$ the workload of machine i.
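
As an illustration of the allocation rule just described, the following is a minimal Python sketch, assuming the reported times are given as an n-by-m list of lists `t` with `t[i][j]` the time of task j on machine i. The function names `vcg_allocate` and `makespan` are hypothetical, and payments are not modelled.

```python
import random


def vcg_allocate(t, rng=None):
    """Return, for every task j, the index of the machine that receives it.

    Each task goes to a machine with minimum processing time for that task;
    ties are broken uniformly at random, as assumed in the text.
    """
    rng = rng or random.Random()
    n, m = len(t), len(t[0])
    allocation = []
    for j in range(m):
        best = min(t[i][j] for i in range(n))
        candidates = [i for i in range(n) if t[i][j] == best]
        allocation.append(rng.choice(candidates))
    return allocation


def makespan(t, allocation):
    """Maximum workload Y_i over all machines under the given allocation."""
    loads = [0.0] * len(t)
    for j, i in enumerate(allocation):
        loads[i] += t[i][j]
    return max(loads)
```

For instance, with two machines and three tasks, `vcg_allocate([[1, 5, 2], [3, 1, 4]])` sends tasks 0 and 2 to machine 0 and task 1 to machine 1, giving a makespan of 3.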

Note that in the machine-identical setting $a_{ij} = 1$ with probability $\frac{1}{n}$ for any task j. So,
VCG exhibits a balls-in-bins type behaviour in this setting, as the machine that minimizes the
processing time of each task, and hence, the machine that will be allocated the task, is chosen
uniformly at random for each task. We thus know from traditional balls-in-bins analysis, that
the expected maximum number of tasks that will be allocated to any machine will be $\Theta\left(\frac{\ln n}{\ln\ln n}\right)$,
whenever $m = \Theta(n)$. For more precise balls-in-bins type bounds see Raab and Steger [31]. We
will use the following theorem to prove in Section 3 that the above instance that yields the
$\Theta\left(\frac{\ln n}{\ln\ln n}\right)$ bound is actually the worst case scenario for VCG:


Theorem 1 (Berenbrink et al. [6]). Assume two vectors $w \in \mathbb{R}^{m}$, $w' \in \mathbb{R}^{m'}$ with $m \le m'$ and
their values in non-increasing order (that is $w_1 \ge w_2 \ge \dots \ge w_m$ and $w'_1 \ge w'_2 \ge \dots \ge w'_{m'}$).
If the following two conditions hold:

(i) $\sum_{j=1}^{m} w_j = \sum_{j=1}^{m'} w'_j$,

(ii) $\sum_{j=1}^{k} w_j \ge \sum_{j=1}^{k} w'_j$ for all $k \in [m]$,

then the expected maximum load when allocating m balls with weights according to w is at
least equal to the expected maximum load when allocating m' balls with weights according to w',
uniformly at random to the same number of bins.


Following [6] we say that vector w majorizes w' whenever w and w' satisfy Conditions (i)
and (ii) of Theorem 1.
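
The two conditions of Theorem 1 are easy to check mechanically. The sketch below is only an illustration: the function name `majorizes` and the floating-point tolerance `tol` are assumptions, not part of the original statement.

```python
def majorizes(w, w_prime, tol=1e-12):
    """Check Conditions (i) and (ii) of Theorem 1 for weight vectors w and w_prime.

    Both vectors are sorted in non-increasing order first; w must not have
    more entries than w_prime.
    """
    w = sorted(w, reverse=True)
    w_prime = sorted(w_prime, reverse=True)
    if len(w) > len(w_prime):
        return False
    # Condition (i): equal total weight.
    if abs(sum(w) - sum(w_prime)) > tol:
        return False
    # Condition (ii): every prefix sum of w dominates that of w_prime.
    prefix_w = prefix_wp = 0.0
    for k in range(len(w)):
        prefix_w += w[k]
        prefix_wp += w_prime[k]
        if prefix_w < prefix_wp - tol:
            return False
    return True
```

For example, three unit weights majorize six weights of 1/2 each, so `majorizes([1, 1, 1], [0.5] * 6)` holds.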


Probability preliminaries. We now give some additional notation regarding properties of 
distributions that will be used in the analysis. 

Let T be a random variable following a probability distribution $\mathcal{D}$. Assuming we perform n
independent draws from $\mathcal{D}$, we use $T[r:n]$ to denote the r-th order statistic (the r-th smallest)
of the resulting values, following the notation from [8]. In particular, $T[1:n]$ will denote
the minimum of n draws from $\mathcal{D}$, while $T[1:n][m:m]$ denotes the maximum value of m
independent experiments where each one is the minimum of n draws from $\mathcal{D}$. Note that for
$t_{ij} \sim \mathcal{D}_j$ and $T_j \sim \mathcal{D}_j$, the expected processing time of machine i for task j under VCG is

$$E[Y_{ij}] = \Pr[a_{ij} = 1] \cdot E[t_{ij} \mid a_{ij} = 1] = \frac{1}{n}\, E\big[T_j[1:n]\big]. \qquad (1)$$

²We note here that for continuous distributions, such events of ties occur with zero probability.

In this work we also consider the class of probability distributions that have a monotone
hazard rate (MHR). A continuous distribution with pdf f and cdf F is MHR if its hazard rate
$h(x) = \frac{f(x)}{1-F(x)}$ is a (weakly) increasing function. The definition of discrete MHR distributions
is similar, only the hazard rate of a discrete distribution is defined as $h(x) = \frac{\Pr[X = x]}{\Pr[X \ge x]}$ (see
e.g. Barlow et al. [5]). The following two technical lemmas demonstrate properties of MHR
distributions. The proofs can be found in Appendix A.

Lemma 1. If T is a continuous MHR random variable, then for every positive integer n, its
first order statistic $T[1:n]$ is also MHR.

Lemma 2. For any continuous MHR random variable X and any positive integer r, $E[X^r] \le r!\, E[X]^r$.
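
As a quick sanity check of Lemma 2 (not a proof), one can compare exact moments for two standard MHR distributions: the uniform distribution on [0,1], where $E[X^r] = \frac{1}{r+1}$, and the exponential distribution with rate λ, where $E[X^r] = r!/\lambda^r$ and the bound holds with equality. The helper names below are hypothetical.

```python
from fractions import Fraction
from math import factorial


def uniform_moment(r):
    """E[X^r] for X uniform on [0, 1]."""
    return Fraction(1, r + 1)


def exponential_moment(r, rate):
    """E[X^r] for X exponential with the given rate."""
    return Fraction(factorial(r)) / Fraction(rate) ** r


def lemma2_bound(mean, r):
    """Right-hand side r! * E[X]^r of Lemma 2."""
    return factorial(r) * Fraction(mean) ** r
```

For the uniform distribution the bound is strict for every r ≥ 2, while the exponential distribution shows that the factor r! cannot be improved in general.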

We now introduce the notion of k-stretched distributions. The property that identifies these
distributions plays an important role in the approximation ratio of VCG as we will see later in
the analysis (Theorem 7).

Definition 1. Given a function k over integers, we call a distribution k-stretched if its order
statistics satisfy

$$E\big[T[1:n][n:n]\big] \ge k(n) \cdot E\big[T[1:n]\big],$$

for all positive integers n.
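
To make Definition 1 concrete, here is a short worked computation for the exponential distribution with rate λ (the symbol λ and the harmonic number $H_n$ are notation introduced only for this illustration). It uses two standard facts: the minimum of n independent exponentials with rate λ is exponential with rate nλ, and the expected maximum of n independent exponentials with rate μ is $H_n/\mu$.

```latex
\begin{align*}
T \sim \mathrm{Exp}(\lambda) \;&\Longrightarrow\; T[1:n] \sim \mathrm{Exp}(n\lambda),
  \qquad E\big[T[1:n]\big] = \frac{1}{n\lambda},\\
E\big[T[1:n][n:n]\big] &= \frac{1}{n\lambda}\sum_{i=1}^{n}\frac{1}{i}
  = H_n \cdot E\big[T[1:n]\big] \;\ge\; \ln n \cdot E\big[T[1:n]\big].
\end{align*}
```

So, under these facts, any exponential distribution satisfies the definition with $k(n) = \ln n$.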

We will use the following result by Aven to bound the expected makespan of VCG.

Theorem 2 (Aven [3]). If $X_1, X_2, \dots, X_n$ are (not necessarily independent) random variables
with mean $\mu$ and variance $\sigma^2$, then

$$E\big[\max_i X_i\big] \le \mu + \sqrt{n-1}\,\sigma.$$

Finally, we use the notation introduced in the probability preliminaries to present some 
known bounds on the expected optimal makespan. So, if given a matrix of processing times t 
we denote its optimal makespan by OPT(t), we wish to bound Et [OPT(t)] (we omit dependence 
on t for clarity of presentation). Part of the notorious difficulty of the scheduling problem stems 
exactly from the lack of general, closed-form formulas for the optimal makespan. However, the 
following two easy lower bounds are widely used (see e.g. [8]).

Observation 3. If the processing times are drawn from machine-identical distributions, then
the expected optimal makespan is bounded by

$$E[\mathrm{OPT}] \ge \max\left\{ E\Big[\max_{j} T_j[1:n]\Big],\; \frac{1}{n}\sum_{j=1}^{m} E\big[T_j[1:n]\big] \right\},$$

where $T_j$ follows the distribution corresponding to task j.
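
For intuition, both terms of Observation 3 can be evaluated in closed form in the special case of i.i.d. exponential processing times with rate λ, since then $E[T[1:n]] = \frac{1}{n\lambda}$ and the expected maximum of m independent copies of $T[1:n]$ is $H_m/(n\lambda)$. The sketch below assumes exactly this setting; the function names are hypothetical.

```python
from fractions import Fraction


def harmonic(k):
    """k-th harmonic number H_k as an exact fraction."""
    return sum(Fraction(1, i) for i in range(1, k + 1))


def opt_lower_bound_exponential(n, m, rate=1):
    """Lower bound of Observation 3 for i.i.d. Exp(rate) processing times.

    n machines, m tasks; returns max of the two terms in the observation.
    """
    expected_min = Fraction(1, n) / Fraction(rate)   # E[T[1:n]]
    max_term = harmonic(m) * expected_min            # E[max_j T_j[1:n]]
    avg_term = Fraction(m, n) * expected_min         # (1/n) * sum_j E[T_j[1:n]]
    return max(max_term, avg_term)
```

With few tasks the first (maximum) term dominates, while for many tasks the second (average load) term takes over, which mirrors the case split between Theorem 7 and Corollary 6.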


3 Upper Bounds 

In this section we provide results on the performance of the VCG mechanism for the Bayesian
scheduling problem for different assumptions on the number of tasks (compared to the machines),
and different distributional assumptions on their processing times. Our first result
shows that VCG is $O\left(\frac{\ln n}{\ln\ln n}\right)$-approximate in the general case, without assuming identical
tasks or even MHR distributions. We then consider some additional assumptions under which
VCG achieves a constant approximation of the expected optimal makespan. In what follows,
an allocation where all machines have the same workload will be called fully balanced.
Theorem 4. VCG is $O\left(\frac{\ln n}{\ln\ln n}\right)$-approximate for the Bayesian scheduling problem with n identical
machines.

As we will see later in Theorem 10, this result is in general tight. In order to prove Theorem 4
we will make use of the following lemma:

Lemma 3. If VCG is ρ-approximate for the prior free scheduling problem with identical machines
on inputs for which the optimal allocation is fully balanced, then VCG is ρ-approximate
for the Bayesian scheduling problem where the machines are a priori identical.

Proof. We will show that for any instance of Bayesian scheduling with a priori identical machines,
there exists a prior free scheduling instance with identical machines for which the approximation
ratio of VCG is at least the same. In fact, there exists such a prior free instance,
for which the optimal allocation is fully balanced.

Consider a Bayesian scheduling instance where $t_{ij} \sim \mathcal{D}_j$ for tasks $j \in [m]$ and machines
$i \in [n]$. Let $\rho \ge 1$ be such that $E_t[\mathrm{VCG}(t)] = \rho \cdot E_t[\mathrm{OPT}(t)]$. Then, conditioning on the
minimum processing times of the tasks, there exists an m-dimensional vector $(t_1^*, \dots, t_m^*)$ such
that

$$E_t\Big[\mathrm{VCG}(t) \;\Big|\; \min_i t_{i1} = t_1^* \wedge \dots \wedge \min_i t_{im} = t_m^*\Big] \ge \rho \cdot E_t\Big[\mathrm{OPT}(t) \;\Big|\; \min_i t_{i1} = t_1^* \wedge \dots \wedge \min_i t_{im} = t_m^*\Big].$$

Notice that, once such a minimum processing time has been fixed for all tasks, the only
randomization remaining within the expected makespan of VCG is the one with respect to the
identities of the machines having processing time $t_j^*$ and the possible internal tie breaking; thus,
if we let $t^*$ denote the time matrix where task j has processing time $t_{ij} = t_j^*$ for all machines i,
it holds that

$$E_t\Big[\mathrm{VCG}(t) \;\Big|\; \min_i t_{i1} = t_1^* \wedge \dots \wedge \min_i t_{im} = t_m^*\Big] = \mathrm{VCG}(t^*).$$

Also, once we have fixed the smallest element in every column of an input matrix t (a column
contains the processing times of a single task on all the machines), reducing all other values of
a column j to be equal to that minimum value $t_j^*$ can only improve the optimal makespan, so

$$E_t\Big[\mathrm{OPT}(t) \;\Big|\; \min_i t_{i1} = t_1^* \wedge \dots \wedge \min_i t_{im} = t_m^*\Big] \ge \mathrm{OPT}(t^*).$$

Combining the above, we get that indeed $\mathrm{VCG}(t^*) \ge \rho \cdot \mathrm{OPT}(t^*)$.

It remains to be shown that, without loss, $t^*$ gives rise to an optimal (prior free) allocation
that is fully balanced, that is all machines have exactly the same workload (equal to the optimal
makespan). Indeed, if that is not the case, then for any machine whose workload is strictly
below the optimal makespan, we can slightly increase the processing time $t_j^*$ of one of its tasks
j without affecting the optimal makespan, while at the same time that increase can only make
the performance of VCG worse. □

We are now ready to prove Theorem 4. Lemma 3 essentially reduces the analysis of VCG for
the Bayesian scheduling problem for identical machines to that of a simple weighted balls-in-bins
setting:

Proof of Theorem 4. From Lemma 3, it is enough to analyze the performance of VCG on input
matrices where the processing time of each task is the same across all machines and the optimal
schedule is fully balanced. Without loss (by scaling) it can be further assumed that the optimal
makespan is exactly 1. Then, since VCG is breaking ties uniformly at random, the problem
is reduced to analyzing the expected maximum (weighted) load when throwing m balls with
weights $(w_1, \dots, w_m) = w$ (uniformly at random) into n bins, when $\sum_{j=1}^{m} w_j = n$. Then, by
Theorem 1, that maximum load is upper bounded by the expected maximum load of throwing
n (unit weight) balls into n bins, because the n-dimensional unit vector $\mathbf{1}_n$ majorizes w: $\mathbf{1}_n$'s
components sum up to n and also $w_j \le 1$ for all $j \in [n]$ (due to the assumption that the optimal
makespan is 1). By classic balls-in-bins results (see e.g. [28, 31]), the expected maximum load
of any machine is upper bounded by $O\left(\frac{\ln n}{\ln\ln n}\right)$. □
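
The comparison used in this proof can be checked exactly on tiny instances by enumerating all assignments of balls to bins. The following is only a brute-force sketch for illustration (the function name `expected_max_load` is hypothetical, and the enumeration is exponential in the number of balls).

```python
from fractions import Fraction
from itertools import product


def expected_max_load(weights, n_bins):
    """Exact expected maximum bin load when each weighted ball is thrown
    independently and uniformly at random into one of n_bins bins."""
    weights = [Fraction(w) for w in weights]
    total = Fraction(0)
    for assignment in product(range(n_bins), repeat=len(weights)):
        loads = [Fraction(0)] * n_bins
        for w, b in zip(weights, assignment):
            loads[b] += w
        total += max(loads)
    return total / n_bins ** len(weights)
```

For two bins, two unit balls give an expected maximum load of 3/2, while four balls of weight 1/2 (same total weight, majorized by the unit vector) give 11/8, consistent with Theorem 1.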

We now focus on the special but important case where both tasks and machines are a priori 
identical: 

Theorem 5. VCG is $2\left(1+\frac{n\ln n}{m}\right)$-approximate for the Bayesian scheduling problem with i.i.d.
processing times drawn from a continuous MHR distribution.

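Proof. Let T be a random variable following the distribution from which the execution times $t_{ij}$
are drawn. Following the notation introduced in Section 2, the workload of a machine
i is given by the random variable $Y_i = \sum_{j=1}^{m} Y_{ij}$. Then, for the expected makespan $E[\max_i Y_i]$
and any real s > 0 it holds that

$$e^{s \cdot E[\max_i Y_i]} \le E\Big[\max_i e^{s Y_i}\Big] \le \sum_{i=1}^{n} E\big[e^{s Y_i}\big] = \sum_{i=1}^{n} \prod_{j=1}^{m} E\big[e^{s Y_{ij}}\big], \qquad (2)$$

where we have used Jensen's inequality based on the convexity of the exponential function, and
the fact that for a fixed machine i the random variables $Y_{ij}$, j = 1, ..., m, are independent (the
processing times are i.i.d. and VCG allocates each task independently of the others). We now
bound the term $E[e^{s Y_{ij}}]$:

$$E\big[e^{s Y_{ij}}\big] = E\left[\sum_{r=0}^{\infty} \frac{(s Y_{ij})^r}{r!}\right] = 1 + \sum_{r=1}^{\infty} s^r \frac{E[Y_{ij}^r]}{r!} = 1 + \frac{1}{n}\sum_{r=1}^{\infty} s^r \frac{E\big[T[1:n]^r\big]}{r!} \le 1 + \frac{1}{n}\sum_{r=1}^{\infty} \big(s\, E[T[1:n]]\big)^r,$$

where for the last inequality we have used the fact that the first order statistic of an MHR
distribution is also MHR (Lemma 1) and Lemma 2. Then, by choosing $s = s^* = \frac{1}{2\, E[T[1:n]]}$ we
get that

$$E\big[e^{s^* Y_{ij}}\big] \le 1 + \frac{1}{n}\sum_{r=1}^{\infty} 2^{-r} = 1 + \frac{1}{n},$$

and (2) yields

$$\begin{aligned}
E\big[\max_i Y_i\big] &\le \frac{1}{s^*} \ln\Big(n\, E\big[e^{s^* Y_{ij}}\big]^m\Big) \\
&\le 2 \ln\left(n \Big(1 + \frac{1}{n}\Big)^{m}\right) E\big[T[1:n]\big] \\
&\le 2 \ln\left(n\, e^{m/n}\right) E\big[T[1:n]\big] \\
&= 2 \left(\ln n + \frac{m}{n}\right) E\big[T[1:n]\big]. \qquad (3)
\end{aligned}$$

But from Observation 3 we know that $E[\mathrm{OPT}] \ge \frac{m}{n} E[T[1:n]]$ for the case of i.i.d. execution
times, and the theorem follows. □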

Notice that Theorem 5 in particular implies that VCG achieves a small, constant approximation
ratio whenever the number of tasks is slightly more than that of machines:


Corollary 6. VCG is 4-approximate for the Bayesian scheduling problem with $m \ge n\ln n$ i.i.d.
tasks drawn from a continuous MHR distribution.
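
The arithmetic behind Corollary 6 is simply that $\frac{n\ln n}{m} \le 1$ once $m \ge n\ln n$. A tiny Python sketch (the function name `theorem5_ratio` is hypothetical) evaluates the bound of Theorem 5 for concrete values of n and m:

```python
import math


def theorem5_ratio(n, m):
    """Upper bound 2 * (1 + n ln n / m) on the approximation ratio of VCG
    from Theorem 5, for n machines and m i.i.d. continuous MHR tasks."""
    return 2 * (1 + n * math.log(n) / m)
```

For m equal to the smallest integer that is at least n ln n the value is at most 4, whereas for m = n it grows like 2(1 + ln n), which is why the regime of few tasks needs the separate treatment that follows.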
The following theorem will help us analyze the performance of VCG for the complementary
case to that of Corollary 6, that is when the number of tasks is $m \le n \ln n$. Recall the notion
of k-stretched distributions introduced in Definition 1.

Theorem 7. VCG is $\frac{4\ln n}{k(n)}$-approximate for the Bayesian scheduling problem with $m \le n\ln n$
i.i.d. tasks drawn from a k-stretched MHR distribution.

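Proof. From (3) we can deduce that the approximation ratio of VCG is upper bounded by

$$\begin{aligned}
\left(\ln n + \frac{m}{n}\right) \frac{2\, E\big[T[1:n]\big]}{E[\mathrm{OPT}]}
&\le \left(\ln n + \frac{m}{n}\right) \frac{2\, E\big[T[1:n]\big]}{E\big[T[1:n][m:m]\big]} \\
&\le \left(\ln n + \frac{m}{n}\right) \frac{2\, E\big[T[1:n]\big]}{E\big[T[1:n][n:n]\big]} \\
&\le \left(\ln n + \frac{m}{n}\right) \frac{2}{k(n)} \\
&\le \frac{4 \ln n}{k(n)},
\end{aligned}$$

where we have used Observation 3 and the fact that $n \le m \le n \ln n$. □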


In particular, we note that Theorem 7 yields a constant approximation ratio for VCG for the
important special cases where the processing times are drawn independently from the uniform
distribution on [0,1] or any exponential distribution. Indeed, the uniform distribution on [0,1]
as well as any exponential distribution is ln-stretched. See the Appendix for a full proof. We get
the following, complementing the results in Corollary 6:

Corollary 8. VCG is 4-approximate for the Bayesian scheduling problem with i.i.d. processing
times drawn from the uniform distribution on [0,1] or an exponential distribution.

We point out that the above corollary cannot be generalized to hold for all MHR distributions,
as the lower bound in Theorem 10 implies. For example, it is not very difficult to check
that by taking $\varepsilon \to 0$ and considering the uniform distribution over $[1, 1+\varepsilon]$, no stretch factor
$k(n) = \Omega(\ln n)$ can be guaranteed.

For our final positive result, we present an improved constant bound on the approximation 
ratio of VCG when we have many tasks: 

Theorem 9. VCG is $(1+\sqrt{2})$-approximate for the Bayesian scheduling problem with $m \ge n^2$
tasks with i.i.d. processing times drawn from a continuous MHR distribution.

Proof. We use Theorem 2 to bound the performance of VCG in this setting. In order to do so,
we first bound the expectation and the variance of the makespan of a single machine. From (1),
for the workload $Y_i$ of any machine i we have:

$$E[Y_i] = \sum_{j=1}^{m} E[Y_{ij}] = \sum_{j=1}^{m} \frac{1}{n} E\big[T[1:n]\big] = \frac{m}{n} E\big[T[1:n]\big].$$

To compute the variance of the makespan of machine i, we note that the random variables $Y_{ij}$
are independent with respect to j, for any fixed machine i and thus we can get

$$\mathrm{Var}[Y_i] = \sum_{j=1}^{m} \mathrm{Var}[Y_{ij}] = \sum_{j=1}^{m} \Big(E\big[Y_{ij}^2\big] - E[Y_{ij}]^2\Big) \le \sum_{j=1}^{m} E\big[Y_{ij}^2\big] = \sum_{j=1}^{m} \frac{1}{n} E\big[T[1:n]^2\big] = \frac{m}{n} E\big[T[1:n]^2\big].$$

We are now ready to use Theorem 2 and bound the expected makespan:

$$\begin{aligned}
E\big[\max_i Y_i\big] &\le E[Y_1] + \sqrt{n-1}\,\sqrt{\mathrm{Var}[Y_1]} \\
&\le \frac{m}{n} E\big[T[1:n]\big] + \sqrt{m}\,\sqrt{E\big[T[1:n]^2\big]} \\
&\le \frac{m}{n} E\big[T[1:n]\big] + \sqrt{2}\sqrt{m}\, E\big[T[1:n]\big] \\
&\le (1+\sqrt{2})\, \frac{m}{n} E\big[T[1:n]\big] \\
&\le (1+\sqrt{2})\, E[\mathrm{OPT}],
\end{aligned}$$

where the third inequality follows from Lemma 1 (and Lemma 2), for the fourth inequality we
use the assumption that $m \ge n^2$ and to complete the proof, the last inequality uses a lower
bound on E[OPT] from Observation 3. □
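
For readers who want the intermediate arithmetic spelled out, the second, third and fourth inequalities of this proof rest on the following elementary steps (using Lemma 2 with r = 2 for the first-order statistic, which is MHR by Lemma 1):

```latex
\begin{align*}
\sqrt{n-1}\,\sqrt{\mathrm{Var}[Y_1]}
  &\le \sqrt{n}\,\sqrt{\tfrac{m}{n}\,E\big[T[1:n]^2\big]}
   = \sqrt{m}\,\sqrt{E\big[T[1:n]^2\big]}
   \le \sqrt{2}\,\sqrt{m}\;E\big[T[1:n]\big],\\
\sqrt{m} \le \frac{m}{n}
  &\iff n \le \sqrt{m} \iff m \ge n^2 .
\end{align*}
```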


4 Lower Bounds 


In this section we prove some lower bounds on the performance of VCG under different distributional
assumptions on the processing times. In an attempt for a clear comparison of VCG
with the mechanisms that were previously considered for the Bayesian scheduling problem (in
[8]), we provide instances that lower bound their performance as well.

Theorem 10. For any number of tasks, there exists an instance of the Bayesian scheduling
problem where VCG is not better than $\Omega\left(\frac{\ln n}{\ln\ln n}\right)$-approximate and the processing times are
drawn from machine-identical continuous MHR distributions.

Proof. Consider an instance with $n$ identical machines and $m$ tasks where for any machine
$i$, task $j$ has processing time $t_{ij} = 1$ with probability 1 for $j = 1, \dots, n-1$ and processing
time $t_{ij} = \frac{1}{m-n+1}$ with probability 1 for $j = n, \dots, m$. From classical results from balls-in-bins
analysis (see also the proof of Theorem ) we can deduce that the expected maximum
number of unit-weight tasks allocated to any machine by VCG is $\Omega\left(\frac{\ln n}{\ln \ln n}\right)$. On the other
hand, there exists an allocation that achieves a makespan equal to 1, namely to allocate all of
the $m - n + 1$ “small” tasks to a single machine and allocate each of the remaining unit-cost
tasks to a different machine. The theorem follows by noticing that we can, without loss, replace
these point-mass distributions on 1 and $\frac{1}{m-n+1}$ with uniform distributions over small intervals
around those points. □
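
The balls-in-bins behaviour used in the proof can be observed numerically. The sketch below throws the $n-1$ unit-cost tasks into $n$ machines uniformly at random and averages the maximum load, printing it next to $\ln n / \ln\ln n$. It assumes that VCG's choice among machines with (near-)identical processing times is uniformly random, which is what the small-interval perturbation at the end of the proof would induce; the function name and the trial count are illustrative assumptions.

```python
import math
import random


def expected_max_load(n, trials=300, seed=1):
    """Average maximum number of unit tasks on one machine (n-1 tasks, n machines)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        load = [0] * n
        for _ in range(n - 1):
            # Assumption: each unit task lands on a uniformly random machine.
            load[rng.randrange(n)] += 1
        total += max(load)
    return total / trials


for n in (10, 100, 1000):
    reference = math.log(n) / math.log(math.log(n))
    print(n, round(expected_max_load(n), 2), round(reference, 2))
```

Since the optimal makespan of the instance is 1, the printed average maximum load is also an estimate of the approximation ratio of VCG on it.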


Notice that when the number of tasks equals that of the machines, i.e. $m = n$, then the bad
instance for the lower bound of Theorem 10 is in fact an i.i.d. instance where tasks are identical
as well and all $t_{ij}$'s are drawn from the same distribution, and not just an instance with only
machines being identical. However, if we restrict our focus to discrete distributions, then
we can strengthen that lower bound to hold for i.i.d. distributions for essentially any number
of tasks and not only for $m = n$:

Theorem 11. For any number of $m = O(ne^n)$ tasks, there exists an instance of the Bayesian
scheduling problem where VCG is not better than $\Omega\left(\frac{\ln n}{\ln \ln n}\right)$-approximate and the tasks have
i.i.d. processing times drawn from a discrete MHR distribution.


Proof. Consider an instance with $n$ identical machines and $m$ tasks where the processing times
$t_{ij}$ are drawn from $\{0,1\}$ such that $t_{ij} = 1$ with probability $\left(\frac{n}{m}\right)^{1/n} = p$ and $t_{ij} = 0$ with
probability $1 - p$. Notice that this is a well-defined distribution, since for all $m \ge n$ we have
$p \le 1$. Furthermore, it is easy to check that this distribution is MHR; its hazard rate at 0 is
$\frac{\Pr[t_{ij} = 0]}{\Pr[t_{ij} \ge 0]} = \frac{1-p}{1} = 1 - p$ and at 1 is $\frac{\Pr[t_{ij} = 1]}{\Pr[t_{ij} \ge 1]} = \frac{p}{p} = 1$.


Next, let $M$ be the random variable denoting the number of tasks whose best processing
time over all machines is non-zero, that is

$$M = \left|\{\, j \mid \min_i t_{ij} = 1 \,\}\right|.$$

Then $M$ follows a binomial distribution with probability of success $p^n$ and $m$ trials, since the
probability of a task having processing time 1 at all machines (success) is $p^n$, while there are $m$
tasks in total. Given the definition of $p$, the average number of tasks that will end up requiring
a processing time of 1 on every machine is $\mathrm{E}[M] = mp^n = n$. Also, we can derive that

$$\Pr[M \ge 3n] \le e^{-n} \qquad \text{and} \qquad \Pr\left[M \le \frac{n}{2}\right] \le e^{-n/8}$$

using Chernoff bounds (in the forms stated in the footnote). As we have argued before, we can use classical results from balls-in-bins
analysis to bound the performance of VCG. So, if $\frac{n}{2} \le M \le 3n$, we know that the expected
makespan will be $\Omega\left(\frac{\ln n}{\ln \ln n}\right) \cdot 1$, since each of these tasks has processing time 1 on all machines. That event
happens almost surely, with probability at least $1 - e^{-n} - e^{-n/8} = 1 - o(1)$.

On the other hand, we next show that the mechanism that simply balances the $M$ “expensive” tasks across the machines (by allocating at most $\left\lceil \frac{M}{n} \right\rceil$ of them to every machine) achieves a
constant makespan, hence providing a constant upper bound on the optimal makespan:

$$\mathrm{E}[\mathrm{OPT}] \le \Pr[M \le 3n] \cdot \frac{3n}{n} \cdot 1 + \Pr[M \ge 3n] \cdot \left\lceil \frac{m}{n} \right\rceil \cdot 1 \le 3 + e^{-n}\left(\frac{m}{n} + 1\right) \le 4 + \frac{m}{ne^n} = O(1).$$


□ 
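
The quantities in this proof are easy to evaluate exactly for small parameters. The sketch below computes $p = (n/m)^{1/n}$, the mean $\mathrm{E}[M] = mp^n$, and the two exact binomial tails, so that they can be compared with the Chernoff estimates $e^{-n}$ and $e^{-n/8}$ and with the resulting bound $4 + \frac{m}{ne^n}$ on $\mathrm{E}[\mathrm{OPT}]$. The helper names and the choice $n = 10$, $m = 200$ are assumptions made only for this illustration.

```python
import math


def binomial_pmf(k, trials, q):
    """Probability that a Binomial(trials, q) variable equals k."""
    return math.comb(trials, k) * q**k * (1 - q) ** (trials - k)


def instance_stats(n, m):
    """Return p, E[M], Pr[M >= 3n] and Pr[M <= n/2] for the instance of the proof."""
    p = (n / m) ** (1 / n)      # Pr[t_ij = 1]
    q = p**n                    # Pr[a task costs 1 on every machine] = n/m
    mean = m * q                # E[M]
    upper_tail = sum(binomial_pmf(k, m, q) for k in range(3 * n, m + 1))
    lower_tail = sum(binomial_pmf(k, m, q) for k in range(0, n // 2 + 1))
    return p, mean, upper_tail, lower_tail


n, m = 10, 200
p, mean, upper_tail, lower_tail = instance_stats(n, m)
print(f"p = {p:.4f}, E[M] = {mean:.4f}")
print(f"Pr[M >= 3n] = {upper_tail:.3e}  (Chernoff estimate {math.exp(-n):.3e})")
print(f"Pr[M <= n/2] = {lower_tail:.3e} (Chernoff estimate {math.exp(-n / 8):.3e})")
print(f"upper bound on E[OPT]: {4 + m / (n * math.exp(n)):.4f}")
```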


Notice however that Theorem 11 still leaves open the possibility for continuous MHR distributions to perform better (see also Theorem and Corollary).

We finally conclude with a couple of simple observations, for the sake of completeness. First,
our initial requirement (see Section ) for identical machines (which is a standard one, see [8])
is crucial for guaranteeing any non-trivial approximation ratios on the performance of VCG:


Observation 12. There exists an instance of the Bayesian scheduling problem where VCG
is not better than $n$-approximate even when the tasks are identically distributed according to
continuous MHR distributions.


Proof. Assume $\frac{m}{n}$ is an integer, and give as input the point-mass distributions $t_{1j} = 1 - \varepsilon$
and $t_{ij} = 1$ for all $j \in [m]$ and $i = 2, 3, \dots, n$, where $\varepsilon \in (0,1)$. Notice that the execution times
are indeed identical across the tasks. The VCG mechanism allocates all jobs to machine 1, for
a makespan of $m \cdot (1 - \varepsilon)$, while the algorithm that assigns $\frac{m}{n}$ jobs to each machine achieves a
makespan of at most $\frac{m}{n} \cdot 1$, resulting in a ratio of $n$ as $\varepsilon \to 0$. Without loss, the above analysis
carries over even if we replace the point-mass distributions with uniform distributions over a
small interval around the values $1 - \varepsilon$ and 1. These distributions are MHR, which concludes
the proof. □
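
The instance of Observation 12 can be replayed directly. The following sketch builds the point-mass processing times, runs the VCG rule (each task to its fastest machine) and a round-robin balanced allocation, and returns the ratio of the two makespans, which equals $n(1-\varepsilon)$. The function names and the round-robin allocation used for the balanced schedule are assumptions of this illustration.

```python
def makespan(times, assignment):
    """Makespan when task j runs on machine assignment[j]; times[i][j] is t_ij."""
    load = [0.0] * len(times)
    for j, i in enumerate(assignment):
        load[i] += times[i][j]
    return max(load)


def observation12_ratio(n, m, eps):
    """Ratio between the VCG makespan and a balanced makespan on the bad instance."""
    assert m % n == 0, "the proof assumes that m/n is an integer"
    # Machine 0 (machine 1 in the text) is slightly faster on every task.
    times = [[1.0 - eps] * m] + [[1.0] * m for _ in range(n - 1)]
    vcg = [min(range(n), key=lambda i: times[i][j]) for j in range(m)]
    balanced = [j % n for j in range(m)]
    return makespan(times, vcg) / makespan(times, balanced)


print(observation12_ratio(n=4, m=12, eps=0.01))
```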

We now present some lower bounds on the performance of the mechanisms analyzed by
Chawla et al. [5]. A definition of these mechanisms can be found in the introduction. The
following demonstrates that the analysis of the approximation ratio for the class of bounded
overload mechanisms presented in [5] is asymptotically tight:

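Footnote: Here we use the following forms of the Chernoff bound, with $\beta_1 = 2$ and $\beta_2 = \frac{1}{2}$: for all $\beta_1 > 0$ and $0 < \beta_2 < 1$,

$$\Pr[X \ge (1 + \beta_1)\mu] \le e^{-\frac{\beta_1^2}{2 + \beta_1}\mu} \qquad \text{and} \qquad \Pr[X \le (1 - \beta_2)\mu] \le e^{-\frac{\beta_2^2}{2}\mu}$$

for any binomial random variable $X$ with mean $\mu$.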
Observation 13. For any number of $m \ge n$ tasks, there exists an instance of the Bayesian
scheduling problem where a bounded overload mechanism with parameter $c$ is not better than
$\min\{c\frac{m}{n}, n - 1\}$-approximate and the processing times are drawn from machine-identical continuous MHR distributions.


Proof. Consider the instance of Theorem 10 and recall that the optimal makespan is equal
to 1. We note that since each task has the same processing time at any machine, all possible allocations such that no machine is assigned more than $c\frac{m}{n}$ tasks are valid outputs of
bounded overload mechanisms with parameter $c$. Now consider the bounded overload mechanism which fixes an ordering of the machines and then breaks ties according to that ordering.
This mechanism would allocate at least $\min\{c\frac{m}{n}, n - 1\}$ unit-cost tasks on the first machine in
its ordering. □


The same instance can be used to bound the performance of the bounded overload mechanism with parameter $c$ that breaks ties uniformly at random as well. Having sufficiently many
tasks ($m = \Omega\left(\frac{n \ln n}{\ln \ln n}\right)$) ensures that the mechanism behaves almost like the VCG mechanism
while allocating the unit-cost tasks, assuming they are the first to be allocated. This gives a
lower bound of $\Omega\left(\frac{\ln n}{\ln \ln n}\right)$ on the approximation ratio of this mechanism as well.

Similar instances can provide lower bounds on the performance of the class of sieve and
bounded overload mechanisms with parameters $c$, $\beta$, and $\delta$, even for the case of i.i.d. processing
times. To see this, notice that if all tasks have $t_{ij} = 1$ with probability 1 on any machine
($T[1:k] = 1$ for any $k$), and we choose threshold $\beta < 1$ as is done in [5] for the case $m \le n \ln n$,
then a sieve and bounded overload mechanism with parameters $c$, $\beta < 1$, and $\delta$ immediately
reduces to a bounded overload mechanism with parameter $c$ on $\delta n$ machines.


Acknowledgements: We want to thank Elias Koutsoupias for useful discussions. 
A Omitted Proofs from Section 2

Lemma. If $T$ is a continuous MHR random variable then for any positive integer $n$, its first
order statistic $T[1:n]$ is also MHR.

Proof. If $T$ is a continuous real random variable with cdf $F$ and pdf $f$, then the cdf and pdf of
$T[1:n]$ are $F^{(1)}(x) = 1 - (1 - F(x))^n$ and $f^{(1)}(x) = n f(x)(1 - F(x))^{n-1}$, respectively. So, the
hazard rate of $T[1:n]$ is

$$\frac{f^{(1)}(x)}{1 - F^{(1)}(x)} = \frac{n(1 - F(x))^{n-1} f(x)}{(1 - F(x))^n} = n\,\frac{f(x)}{1 - F(x)},$$

which is increasing since $\frac{f(x)}{1 - F(x)}$ is increasing. □

Lemma. For any continuous MHR random variable $X$ and any positive integer $r$, $\mathrm{E}[X^r] \le r!\,\mathrm{E}[X]^r$.

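Proof. For any positive integer $s$, denote the normalized moments $\lambda_s = \frac{\mathrm{E}[X^s]}{s!}$. Then from [3, p. 384] we know that for all integers $i$ and $t > s > 0$,

$$\lambda_{i+t}^{s}\,\lambda_i^{t-s} \le \lambda_{i+s}^{t}.$$

By selecting $t = r$, $s = 1$ and $i = 0$, this inequality gives $\lambda_r \lambda_0^{r-1} \le \lambda_1^r$. We get the desired
inequality by noticing that $\lambda_r = \mathrm{E}[X^r]/r!$, $\lambda_1 = \mathrm{E}[X]$ and $\lambda_0 = \mathrm{E}[1] = 1$. □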

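The continuity assumption in the above lemma is essential, as demonstrated by the following
example: consider a discrete random variable $X$ over $\{0,1\}$ with $\Pr[X = 0] = \frac{1}{2} + \varepsilon$ and
$\Pr[X = 1] = \frac{1}{2} - \varepsilon$, for some small $\varepsilon > 0$. This distribution is MHR since its hazard rates at 0
and 1 are $h(0) = \frac{\Pr[X = 0]}{\Pr[X \ge 0]} = \frac{1}{2} + \varepsilon$ and $h(1) = \frac{\Pr[X = 1]}{\Pr[X \ge 1]} = 1$, respectively. However, it is easy to see
that $\mathrm{E}[X^2] = \mathrm{E}[X] = \Pr[X = 1] = \frac{1}{2} - \varepsilon$ and thus $2\,\mathrm{E}[X]^2 = (1 - 2\varepsilon)\,\mathrm{E}[X] < \mathrm{E}[X^2]$, which violates the inequality of the lemma for $r = 2$.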
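
Both sides of the moment inequality can be compared with exact rational arithmetic. The sketch below evaluates the gap $r!\,\mathrm{E}[X]^r - \mathrm{E}[X^r]$ for a uniform random variable on $[0,1]$ (a continuous MHR distribution, used here only as an example) and the gap $2\,\mathrm{E}[X]^2 - \mathrm{E}[X^2]$ for the two-point distribution of the counterexample. The function names are assumptions of this illustration.

```python
from fractions import Fraction
from math import factorial


def moment_gap_uniform(r):
    """r! E[X]^r - E[X^r] for X uniform on [0, 1], where E[X^r] = 1/(r+1)."""
    return factorial(r) * Fraction(1, 2) ** r - Fraction(1, r + 1)


def moment_gap_two_point(eps):
    """2 E[X]^2 - E[X^2] for Pr[X = 1] = 1/2 - eps and Pr[X = 0] = 1/2 + eps."""
    mean = Fraction(1, 2) - eps
    second_moment = mean          # X^2 = X because X takes values in {0, 1}
    return 2 * mean**2 - second_moment


print([moment_gap_uniform(r) for r in range(1, 5)])   # non-negative gaps
print(moment_gap_two_point(Fraction(1, 100)))         # negative gap
```

A non-negative gap means the inequality holds; the negative gap for the discrete example equals $-2\varepsilon\,\mathrm{E}[X]$.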
B Proof of Corollary

Throughout this section we will use the fact that if $T$ is a random variable with cdf $F$ then for
any positive integer $n$, the cdf's of the first and last order statistics $T[1:n]$ and $T[n:n]$ are
given by

$$F^{(1)}(x) = 1 - (1 - F(x))^n \qquad \text{and} \qquad F^{(n)}(x) = F^n(x), \qquad (4)$$

respectively.


Lemma 4. If $T$ is a uniform random variable over $[0,1]$, then for all positive integers $n, m$

$$\mathrm{E}[T[1:n]] = \frac{1}{n+1} \qquad \text{and} \qquad \mathrm{E}[T[1:n][m:m]] = 1 - mB\left(m, 1 + \frac{1}{n}\right),$$

where $B(x,y) = \int_0^1 t^{x-1}(1-t)^{y-1}\,dt$ is the beta function.

Proof. If $T$ is a uniformly distributed random variable over $[0,1]$ then its cdf is given by $F(x) = x$, $x \in [0,1]$. The first equality is very easy, since from (4) the cdf of $T[1:n]$ is $1 - (1-x)^n$, thus
its expectation is $\int_0^1 (1-x)^n\,dx = \frac{1}{n+1}$. For the second one, again from (4) it is straightforward to
see that the cdf of $T[1:n][m:m]$ is $[1 - (1-x)^n]^m$, so its expectation is $\int_0^1 1 - [1 - (1-x)^n]^m\,dx = 1 - \int_0^1 [1 - (1-x)^n]^m\,dx$. Next we compute the value of this integral

$$I(m) = \int_0^1 [1 - (1-x)^n]^m\,dx.$$


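We have:

$$\begin{aligned}
I(m) &= \int_0^1 [1 - (1-x)^n]^{m-1}\left(1 - (1-x)^n\right) dx \\
&= I(m-1) - \int_0^1 [1 - (1-x)^n]^{m-1}(1-x)^n\,dx \\
&= I(m-1) - \frac{1}{n}\int_0^1 [1 - (1-x)^n]^{m-1}\left(1 - (1-x)^n\right)'(1-x)\,dx \\
&= I(m-1) - \frac{1}{nm}\int_0^1 \left([1 - (1-x)^n]^m\right)'(1-x)\,dx \\
&= I(m-1) - \frac{1}{nm}\Big[(1 - (1-x)^n)^m(1-x)\Big]_0^1 - \frac{1}{nm}\int_0^1 [1 - (1-x)^n]^m\,dx \\
&= I(m-1) - \frac{1}{nm}\int_0^1 [1 - (1-x)^n]^m\,dx \\
&= I(m-1) - \frac{1}{nm}\,I(m),
\end{aligned}$$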

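meaning that

$$I(m) = \frac{nm}{nm+1}\,I(m-1) \qquad \text{with} \qquad I(1) = \int_0^1 1 - (1-x)^n\,dx = 1 - \frac{1}{n+1}.$$

Solving the above recurrence gives

$$I(m) = \frac{1 \cdot 2 \cdot 3 \cdots m}{\left(1 + \frac{1}{n}\right)\left(2 + \frac{1}{n}\right) \cdots \left(m + \frac{1}{n}\right)} = \frac{m\,\Gamma(m)\,\Gamma\left(1 + \frac{1}{n}\right)}{\Gamma\left(m + \frac{1}{n} + 1\right)} = mB\left(m, 1 + \frac{1}{n}\right),$$

where $\Gamma$ denotes the (complete) gamma function.

□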
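
The closed form for $I(m)$ can be cross-checked in three independent ways: by unrolling the recurrence, by evaluating $mB(m, 1 + \frac{1}{n})$ through the log-gamma function, and by integrating numerically with the midpoint rule. The sketch below does all three; the function names, the step count and the test values $n = 3$, $m = 4$ are assumptions made for illustration.

```python
import math


def integral_by_recurrence(n, m):
    """I(m) from I(1) = n/(n+1) and I(k) = nk/(nk+1) * I(k-1)."""
    value = n / (n + 1)
    for k in range(2, m + 1):
        value *= n * k / (n * k + 1)
    return value


def integral_by_beta(n, m):
    """I(m) = m * B(m, 1 + 1/n), with the beta function computed via lgamma."""
    a, b = m, 1 + 1 / n
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return m * math.exp(log_beta)


def integral_by_midpoint(n, m, steps=20000):
    """Midpoint-rule approximation of the integral of [1 - (1-x)^n]^m over [0, 1]."""
    h = 1.0 / steps
    return h * sum((1 - (1 - (k + 0.5) * h) ** n) ** m for k in range(steps))


n, m = 3, 4
print(integral_by_recurrence(n, m), integral_by_beta(n, m), integral_by_midpoint(n, m))
```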
Lemma 5. If $T$ is an exponentially distributed random variable with parameter $\lambda$, then for all
positive integers $n, m$

$$\mathrm{E}[T[1:n]] = \frac{1}{\lambda n} \qquad \text{and} \qquad \mathrm{E}[T[1:n][m:m]] = \frac{H_m}{\lambda n},$$

where $H_m = 1 + \frac{1}{2} + \cdots + \frac{1}{m}$ is the $m$-th harmonic number.

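Proof. If $T$ is exponentially distributed, then its cdf is given by $F(x) = 1 - e^{-\lambda x}$, $x \in [0, \infty)$,
where $\lambda$ is a positive real parameter. The first equality is again easy, since from (4) the cdf of
$T[1:n]$ is $1 - e^{-\lambda n x}$, thus its expectation is $\int_0^\infty e^{-\lambda n x}\,dx = \frac{1}{\lambda n}$. For the second one, from (4)
it is straightforward to see that the cdf of $T[1:n][m:m]$ is $(1 - e^{-\lambda n x})^m$, so its expectation is

$$\begin{aligned}
\int_0^\infty 1 - (1 - e^{-\lambda n x})^m\,dx &= \frac{1}{\lambda n}\int_0^\infty 1 - (1 - e^{-y})^m\,dy && \text{by changing } y = \lambda n x, \\
&= \frac{1}{\lambda n}\int_0^1 \frac{1 - (1-z)^m}{z}\,dz && \text{by changing } z = e^{-y}, \\
&= \frac{1}{\lambda n}\int_0^1 \frac{1 - w^m}{1 - w}\,dw && \text{by changing } w = 1 - z \\
&= \frac{1}{\lambda n}\int_0^1 \left(1 + w + \cdots + w^{m-1}\right) dw \\
&= \frac{H_m}{\lambda n}.
\end{aligned}$$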

□ 
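
The second identity of Lemma 5 lends itself to a Monte Carlo check: draw $m$ independent copies of the minimum of $n$ exponential variables, take their maximum, and compare the empirical mean with $\frac{H_m}{\lambda n}$. The sketch below does this; the function names, the seed, the number of trials and the values $n = 3$, $m = 4$, $\lambda = 2$ are assumptions chosen for illustration.

```python
import random


def harmonic(m):
    """The m-th harmonic number H_m = 1 + 1/2 + ... + 1/m."""
    return sum(1.0 / k for k in range(1, m + 1))


def simulate_max_of_minima(n, m, lam, trials=20000, seed=2):
    """Empirical mean of T[1:n][m:m] when T ~ Exp(lam)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Maximum over m independent copies of the minimum of n exponentials.
        total += max(
            min(rng.expovariate(lam) for _ in range(n)) for _ in range(m)
        )
    return total / trials


n, m, lam = 3, 4, 2.0
print(simulate_max_of_minima(n, m, lam), harmonic(m) / (lam * n))
```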


To conclude the proof of the Corollary, from Lemma 4 we deduce that the stretch factor of
the uniform distribution is

$$\frac{\mathrm{E}[T[1:n][n:n]]}{\mathrm{E}[T[1:n]]} = (n+1)\left(1 - nB\left(n, 1 + \frac{1}{n}\right)\right)$$

and from Lemma 5 the stretch factor for the exponential distribution with parameter $\lambda$ is

$$\frac{\mathrm{E}[T[1:n][n:n]]}{\mathrm{E}[T[1:n]]} = H_n.$$

It can be verified that both the above quantities are lower-bounded by $\ln n$.
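
The final claim can be inspected numerically for a range of $n$. The sketch below evaluates the two stretch factors from the closed forms above and prints them next to $\ln n$; it is a spot check for a few values, not a proof. The function names and the chosen values of $n$ are assumptions of this illustration, and the beta function is computed through the log-gamma function.

```python
import math


def uniform_stretch(n):
    """(n+1) * (1 - n * B(n, 1 + 1/n)) for the uniform distribution on [0, 1]."""
    log_beta = math.lgamma(n) + math.lgamma(1 + 1 / n) - math.lgamma(n + 1 + 1 / n)
    return (n + 1) * (1 - n * math.exp(log_beta))


def exponential_stretch(n):
    """H_n, the stretch factor of any exponential distribution."""
    return sum(1.0 / k for k in range(1, n + 1))


for n in (2, 10, 100, 1000):
    print(n, round(uniform_stretch(n), 4), round(exponential_stretch(n), 4),
          round(math.log(n), 4))
```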